General Guidelines

Pseudocode should be:

  • Clear
    • Must be written in plainly understood English or as a mathematical expression.
    • All objects must be named and defined prior to using them.
    • Avoid using pronouns.
  • Complete
    • Make sure your pseudocode is complete enough to be useful.
  • Concise
    • Keep the instructions simple and to the point.
    • Remove extraneous steps.
  • Correct
    • The pseudocode must be able to produce code which accurately performs the desired task.

Pseudocoding is an iterative process.

  • Start by making sure you fully understand the goal of the algorithm you are going to implement. You should be able to explain, very briefly, what the inputs and output are, and very broadly how the function gets there. This is critical to making your pseudocode clear.
    • Say it out loud a few times until you feel comfortable with it.
    • Write down, in complete sentences, what you are trying to do. Don’t try to be clever or fancy. Don’t get specific about what things are called or what functions you are going to use; just write it down how you would say it to a regular human being.
  • Start to formalize your plan into a pseudocode structure.
    • Look at the examples in the reading for guidance.
    • Identify control flow elements in your description and use the appropriate KEYWORDS and indentation.
  • You may identify sections of pseudocode which would be appropriate to make functions in their own right. In your main function, you may refer to these specific functions, but you’ll need to pseudocode these functions separately later.
  • Make several passes through the pseudocode, expanding on things which are ambiguous. You may make several changes to the structure, change what your loop iterators are (should they be indices or objects?), or decide you need more or different inputs. You’ll repeat this process until you have added all of the details necessary. At the end of this stage, you should have complete pseudocode.
  • Make a few more passes through your pseudocode weeding out redundant or extraneous instructions. This will make your pseudocode concise.
  • With each step, you should verify you’ve not made any changes which would render your solution incorrect. Even so, you should do one final pass through, looking for ambiguities, oversights, or errors.
  • As a further (optional) step, since R is a vectorized language, you might look for areas where you could vectorize your pseudocode. Look for FOR loops where the result doesn’t depend on the order in which you do the operations. NOTE: As pseudocode is meant to be language agnostic, this last step is only for your own benefit, something you should do before you start writing your R code.

The Problem

Write a function is_TRUE() which inputs a vector x and, for each element in x, outputs TRUE if and only if that element of x is TRUE and outputs FALSE otherwise.

As written, this is already a very good plain-English description of what is to be done. There is, however, some ambiguity of language which might prove problematic were we to code it directly as written. While it may be understood that we need to return a vector, it isn’t explicitly stated.

In fact, taken literally, it says to return a value of TRUE, for each element in our input vector, but a function can only return once and can only return one object. So, a Maliciously Compliant Actor (MCA) might write a function that looks something like this:
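A minimal sketch of what such a literal-minded function might look like in R (an illustration written for this walkthrough, assuming the MCA loops over the elements and returns inside the loop):

```r
# Hypothetical "maliciously compliant" reading of the problem statement:
# output TRUE or FALSE "for each element" of x.
is_TRUE <- function(x) {
  for (e in x) {
    if (identical(e, TRUE)) {
      return(TRUE)   # report TRUE for this element
    } else {
      return(FALSE)  # report FALSE for this element
    }
  }
}

# Try it on a vector with more than one element.
result <- is_TRUE(c(FALSE, TRUE, TRUE))
print(result)
```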

Do you see the problem?

We will exit the function in the first iteration every time.

Preparing to Prepare

When communicating among people we (generally) aren’t assuming they are going to take our words hyper-literally or deliberately interpret them in the worst possible way. We don’t have that same luxury when we are dealing with computers, so our first step might (should) be to eliminate as much ambiguity as possible without making it overly complicated or ponderous.

We can do that by explicitly stating we will be returning a vector and being more clear about what is being set to TRUE or FALSE (the elements of the return vector).

The is_TRUE() function inputs a vector x and returns a vector where, for each element in x, the corresponding element of the return vector is TRUE if and only if the element in x is TRUE, and FALSE otherwise.

The Initial Pass

For the very first pass all we should do is format our sentence into a pseudocode structure. List the function name, what goes in, what comes out, and then, having identified the control flow elements in our descriptive sentence, start each one on its own line and indent and nest the statements as needed.

Pseudocode

FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   a logical vector whose values are only TRUE or FALSE.

FOR each element in x
    IF the element in x is identical to TRUE
        SET the corresponding element of the output vector to TRUE
    ELSE
        SET the corresponding element of the output vector to FALSE
RETURN the output vector

At this point, we have pseudocode*.

*We have a long way to go before we rest though. While many of you could, no doubt, successfully code a working function from this point, the goal is to make it foolproof. Pseudocode at this level of detail would generally be less than what we are looking for from you. Let’s move on.

Draw the Rest of the Owl

Making our pseudocode complete.

Pass 1

the element in x

This is a little ambiguous, so we can start out by naming this element, e.

And,

the corresponding element of the output vector

Let’s name this output vector y.

Pseudocode
FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   y, a logical vector whose values are only TRUE or FALSE.

y <- a logical vector the same length as x
FOR each element, e, in x
    IF e identical to TRUE
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y

Nice! That’s a bit more clear and should be complete. We could generate useful code from this, but what else could be clarified or made more precise?

R Implementation
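A minimal sketch of how the Pass 1 pseudocode might translate into R. It assumes x is an atomic vector and uses base R's identical() for the "identical to TRUE" test. The index i is an implementation detail of this sketch, used only so that each element e can be matched to its corresponding element of y.

```r
is_TRUE <- function(x) {
  y <- logical(length(x))        # a logical vector the same length as x
  for (i in seq_along(x)) {
    e <- x[[i]]                  # the current element of x
    if (identical(e, TRUE)) {
      y[i] <- TRUE               # corresponding element of y
    } else {
      y[i] <- FALSE
    }
  }
  return(y)
}
```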

Pass 2

identical to TRUE

What does this mean, exactly?

  • At a minimum it must be equal to TRUE. But 1 == TRUE too.
  • So, it must also have a logical type.
  • Lastly, the IF will throw an error if it cannot make a TRUE or FALSE determination so we need the value to not be unknown.

Pseudocode
FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   y, a logical vector whose values are only TRUE or FALSE.

y <- a logical vector the same length as x
FOR each element, e, in x
    IF e = TRUE AND e is a logical value AND e is not unknown
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y

Alright, this is looking much clearer and feels complete and, if you wanted to, you could go ahead and code this now. It is, however, not yet concise enough to give us great results.

This pseudocode can be translated directly into R. You can also write a vectorized version, but it’s better to eliminate the extraneous parts of the pseudocode before you start thinking about vectorization.
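A direct translation might look like the following sketch. It assumes x is an atomic vector; is.logical() and is.na() are the base R checks chosen here for "is a logical value" and "is not unknown". The two test vectors are the inputs that appear in the printed results.

```r
is_TRUE <- function(x) {
  y <- logical(length(x))        # a logical vector the same length as x
  for (i in seq_along(x)) {
    e <- x[[i]]                  # the current element of x
    # e = TRUE AND e is a logical value AND e is not unknown
    if (e == TRUE && is.logical(e) && !is.na(e)) {
      y[i] <- TRUE
    } else {
      y[i] <- FALSE
    }
  }
  return(y)
}

x <- c(TRUE, FALSE, NA)
print(x)
print(is_TRUE(x))

x <- c(-1, 0, 1)
print(x)
print(is_TRUE(x))
```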

Results

Now, let’s check how we did…

[1]  TRUE FALSE    NA
[1]  TRUE FALSE FALSE
[1] -1  0  1
[1] FALSE FALSE FALSE

Cleaning Up

Making our pseudocode more concise.

Take a look at the current state of our pseudocode:

FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   y, a logical vector whose values are only TRUE or FALSE.

y <- a logical vector the same length as x
FOR each element, e, in x
    IF e = TRUE AND e is a logical value AND e is not unknown
        SET the corresponding element of y to TRUE
    ELSE
        SET the corresponding element of y to FALSE
RETURN y

Do you notice anything about our conditional?

Pass 3

When you set an object to TRUE or FALSE based on the condition in an IF statement, you can generally eliminate the IF and just set the object’s value based on the condition. See the simplified example code below:
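A minimal sketch of such a pattern, with a and b as placeholder names and a assumed to be a known logical value:

```r
a <- TRUE  # placeholder value; assumed to be TRUE or FALSE, not NA

# Set b to TRUE or FALSE depending on the condition.
if (a == TRUE) {
  b <- TRUE
} else {
  b <- FALSE
}
```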

Do you see how, if a == TRUE then b == TRUE and if a == FALSE then b == FALSE? Wouldn’t it be simpler to just set b <- a?

Also, consider this code:
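One possible version of that code, again with placeholder names a and b and a assumed to be a logical value:

```r
a <- FALSE      # placeholder logical value
b <- a == TRUE  # b is set directly from the comparison
```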

Do you notice the == TRUE part is redundant?

I claim that a == TRUE is logically equivalent to a whenever a is a logical value (TRUE, FALSE, or unknown). Think it through until you have convinced yourself of this fact.

Now, let’s implement these changes in our pseudocode.

Pseudocode
FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   y, a logical vector whose values are only TRUE or FALSE.

y <- a logical vector the same length as x
FOR each element, e, in x
    corresponding element of y <- e AND e is a logical value AND e is not unknown
RETURN y

Now this is something which we can easily and cleanly implement both directly and as a vectorized R solution.

R Implementation (Vectorized)
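A minimal sketch of a vectorized version, assuming x is an atomic vector. Because "is a logical value" holds for either every element of an atomic vector or none of them, this sketch checks it once up front. That early return is a design choice made here (applying & to a character vector would raise an error in R), not something the pseudocode requires.

```r
is_TRUE <- function(x) {
  # A non-logical vector contains no TRUE values, so every result is FALSE.
  if (!is.logical(x)) {
    return(logical(length(x)))
  }
  # Element-wise: the value itself AND the value is not unknown (NA).
  return(x & !is.na(x))
}
```

No loop, no pre-allocated result, and no intermediate variable are needed: the whole vector is handled in one expression.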

Bonus pseudocode!

While we’ve said pseudocode is language agnostic (and that is, by and large, true), if you program extensively or exclusively in one language, it is fine to include language-specific ideas or capabilities in the pseudocode you write for yourself. The pseudocode police won’t be busting down your door if you mention certain functions you intend to use or if you plan your vectorization out prior to starting to code. So here is a possible pseudocode example for this problem, tailored specifically for R.

We can generally vectorize a for loop by removing the looping structure and addressing the objects as wholes rather than their elements individually.

Vectorization frees us from needing to know the size of the vector we are working with, so there is no need to track its length. We also do not need to pre-allocate our return vector, and since we would now only be accessing y once, when we create it, there is no need to store it as a variable only to return it immediately.

FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   a logical vector whose values are only TRUE or FALSE.

RETURN the x values AND x is a logical vector AND the values of x are not unknown

Results
[1]  TRUE FALSE    NA
[1]  TRUE FALSE FALSE
[1] -1  0  1
[1] FALSE FALSE FALSE

Final Thoughts

You’ll notice we avoided any mention of indices in the loops of our pseudocode. This is by design. It’s not wrong to use indices of iteration in your pseudocode, but it can lead to mistakes based on erroneous assumptions and language-specific implementation. For instance, some languages (Python, C++) are “zero-indexed”, meaning the first element of a vector is element 0, so pseudocode which says your for loop iterates from 1 to n will cause issues in those languages, especially if implemented directly as written. Likewise, a Python or C++ programmer who pseudocodes something where the intended range of iteration is over all of the elements after the first may indicate something like:

n <- number of books on the shelf
for i in 1 to n - 1

In R, this would process every element but the last or, if you weren’t careful in your implementation, might result in a for loop where i is 1 and then 0 when n is 1.
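The pitfall can be sketched in R as follows. The books vector and the variable names are made up for illustration; 1:(n - 1) is the literal translation of the range above, while negative indexing and seq_len() are base R alternatives suggested here.

```r
books <- c("Emma", "Dracula", "Ulysses")  # hypothetical shelf
n <- length(books)

literal  <- books[1:(n - 1)]  # first and second books: the last one is skipped
intended <- books[-1]         # every book after the first

n <- 1
countdown <- 1:(n - 1)        # 1:0 counts down, giving 1 and then 0
empty     <- seq_len(n - 1)   # zero-length: a for loop over this never runs
```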

For that reason it is almost always preferable to avoid specifying indices and instead say things like:

for each book after the first

Pass 4

While the pseudocode from Pass 3 is what we are looking for from you, if you were working with a much larger function or wanted to do an additional pass prior to coding your function, you could do something like the following. Notice we are still trying to remain as language agnostic as possible by not specifying the index boundaries of x.

Pseudocode (Non-vectorized)

You’ll notice this starts to look very code-like though and borders on not being “pseudo”-code anymore.

FUNCTION: is_TRUE
INPUTS:   x, a vector.
OUTPUT:   y, a logical vector whose values are only TRUE or FALSE.

y <- a logical vector the same length as x
FOR i in the indices of x
    y[i] <- x[i] AND x[i] is a logical value AND x[i] is not unknown
RETURN y
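To illustrate how close this is to real code, here is one possible R translation, assuming x is an atomic vector. seq_along(), is.logical(), and is.na() are base R choices made for this sketch. The three conditions are reordered relative to the pseudocode so that the type and NA checks run first; the logic is unchanged, since AND does not depend on order.

```r
is_TRUE <- function(x) {
  y <- logical(length(x))          # a logical vector the same length as x
  for (i in seq_along(x)) {        # the indices of x
    # && stops at the first FALSE, so x[i] is only used as a condition
    # once it is known to be a non-NA logical value.
    y[i] <- is.logical(x[i]) && !is.na(x[i]) && x[i]
  }
  return(y)
}
```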

Good luck!

Writing good pseudocode is not an arcane art, but it does take practice to become comfortable with it and good at it. In time you’ll write more complete and concise pseudocode more easily and earlier in the process, and you will need fewer revisions to get your pseudocode to a place where it is ready to be translated into code.